We present a search for the standard model Higgs boson produced in association with a Z boson
in 4.2 fb^-1 of pp̄ collisions, collected with the D0 detector at the Fermilab Tevatron at √s = 1.96
TeV. Selected events contain one reconstructed Z → ℓ+ℓ− candidate and at least two jets, including
at least one b-tagged jet. In the absence of an excess over the background expected from other
standard model processes, limits on the ZH cross section multiplied by the branching ratios are set.
The limit at M_H = 115 GeV is a factor of 5.9 larger than the standard model prediction.

PACS numbers: 14.80.Bn, 13.85.Qk, 13.85.Rm



In the standard model (SM), the spontaneous break- 
down of the electroweak gauge symmetry generates 
masses for the W and Z bosons and produces a scalar 
massive particle, the Higgs boson, which has so far eluded 
detection. The discovery of the Higgs boson would top a 
remarkable list of experimentally confirmed SM predic- 
tions. 

For Higgs boson masses M_H below 135 GeV, the primary
Higgs boson decay in the SM is H → bb̄, which is
challenging to discern amidst copious bb̄ production at
the Tevatron pp̄ collider. Consequently, sensitivity to a
low-mass Higgs boson comes predominantly from its
production in association with a W or Z boson that decays
to leptons.

In this Letter, we present a search for ZH → ℓ+ℓ−bb̄,
where ℓ is either a muon or an electron. The searches
for ZH → νν̄bb̄ and ZH → τ+τ−bb̄ are treated elsewhere
[8, 9]. For the ℓ+ℓ−bb̄ final states, the D0 collaboration
has previously used 0.45 fb^-1 of integrated luminosity
to report a cross section upper limit at the 95% CL that
was around 25 times larger than the SM prediction at
M_H = 115 GeV [10], and the CDF collaboration used
2.7 fb^-1 to obtain a factor of around 8 [11].

The data for this analysis were collected at the Fermilab
Tevatron Collider with the D0 detector [12]. After
imposing data quality requirements, the integrated
luminosity is 4.2 fb^-1. The selected events were
predominantly acquired by triggers that provide real-time
identification of electron and muon candidates, but to
maximize acceptance, events from all available triggers
are considered.

The selection of signal-like events requires a primary pp̄
interaction vertex (PV) that has at least three associated
tracks and is located within 60 cm of the center of the
detector along the direction of the beam. Selected events
must also contain a Z boson candidate with a dilepton
invariant mass 60 < m_ℓℓ < 150 GeV.
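
The dilepton mass window can be illustrated with a minimal sketch. It assumes each lepton is described by (p_T, η, φ) and is treated as massless, which is an approximation made here for illustration; the function names are hypothetical and are not taken from the analysis software.

```python
import math

def dilepton_mass(lep1, lep2):
    """Invariant mass in GeV of two leptons given as (pt, eta, phi).

    Assumption for this sketch: leptons are massless, so
    m^2 = 2 * pt1 * pt2 * (cosh(eta1 - eta2) - cos(phi1 - phi2)).
    """
    pt1, eta1, phi1 = lep1
    pt2, eta2, phi2 = lep2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(m2, 0.0))

def passes_z_window(mass, low=60.0, high=150.0):
    """True if the mass lies inside the 60 < m_ll < 150 GeV window."""
    return low < mass < high
```

For two back-to-back leptons of p_T = 45.6 GeV at η = 0, the function returns a mass of 91.2 GeV, which lies inside the window.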

The dimuon (μμ) selection requires at least two muons
matched to central tracks with transverse momenta p_T >
10 GeV. Combined tracking and calorimeter isolation
requirements are applied to the muon pair such that one
muon does not need to be isolated if the other is
sufficiently well isolated. For each muon track, the
pseudorapidity η_det, measured with respect to the center
of the detector, must satisfy |η_det| < 2 [13]. At least one
muon must have |η_det| < 1.5 and p_T > 15 GeV. The
distance of closest approach of each track to the PV in
the plane transverse to the beam direction, d_PV, must be
less than 0.02 cm for tracks with at least one hit in the
silicon microstrip tracker (SMT). A track without SMT
hits must have d_PV < 0.2 cm, and its p_T is corrected
through a constraint to the position of the PV. An
additional dimuon selection, μμ_trk, requires one identified
muon and one isolated track (μ_trk) in the central tracking
detector with p_T > 20 GeV and |η_det| < 2, at least
one hit in the SMT, and d_PV < 0.02 cm [14]. The μ_trk
must be separated in pseudorapidity η and azimuth φ
by ΔR = √((Δη)² + (Δφ)²) > 0.1 from the other muon.
The μμ_trk selection adds 10% signal acceptance to the
μμ selection, mainly from gaps in the muon detector. To
reduce contamination from cosmic rays, the tracks from
both selections must not be back-to-back in η and φ. The
two muons must also have opposite charge.
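
As an illustration of the separation variable ΔR = √((Δη)² + (Δφ)²), the following sketch computes it with the azimuthal difference wrapped into [−π, π). The function names are hypothetical; only the formula itself comes from the text.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def delta_r(eta1, phi1, eta2, phi2):
    """Separation Delta R = sqrt((Delta eta)^2 + (Delta phi)^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

The wrapping matters for objects on either side of φ = ±π: without it, two nearly collinear tracks at φ = 3.1 and φ = −3.1 would appear to be separated by 6.2 rad instead of about 0.08 rad.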

The dielectron (ee) selection requires at least two
electrons of p_T > 15 GeV identified by electromagnetic
showers in the calorimeter. Each shower must be isolated
from other energy depositions and have a shape consistent
with that expected of an electron. At least one electron
must be identified in the central calorimeter (CC,
|η_det| < 1.1), and a second electron either in the CC or
the end calorimeter (EC, 1.5 < |η_det| < 2.5). The CC
electrons must match central tracks or produce a pattern
of hits in the tracker consistent with that expected of an
electron. An additional dielectron selection, ee_ICR,
requires exactly one electron from the CC or EC, with a
second electron identified as a narrow calorimeter cluster
in the inter-cryostat region (ICR, 1.1 < |η_det| < 1.5) with
a matching track in the central tracker [15]. A neural
network (NN_ICR) is used to differentiate ICR electrons
from jets. The ee_ICR selection requires an explicit
single-electron trigger, and adds 17% signal acceptance to
the ee selection.

Jets are reconstructed in the calorimeter using the
iterative midpoint cone algorithm [16] with a cone of radius
0.5. The energy scale of jets is corrected for detector
response, the presence of noise and multiple pp̄
interactions, and energy deposited outside of the
reconstructed jet cone. At least two jets with |η_det| < 2.5
are required, with the leading jet of p_T > 20 GeV and
additional jets of p_T > 15 GeV. Both electrons in
dielectron events are required to be isolated from any jet
by ΔR > 0.5. Likewise, jets must be separated by ΔR > 0.5
from the μ_trk candidate in the μμ_trk channel, but no such
requirement is applied to the muon candidates in either
dimuon channel. To reduce the impact from multiple pp̄
interactions at high instantaneous luminosities, jets must
contain at least two tracks matched to the PV.

To distinguish the decay H → bb̄ from background
processes involving light quarks and gluons, jets are
identified as likely containing b quarks (b-tagged) if they
pass loose or tight requirements on the output of a neural
network trained to separate b-jets from light jets [17]. For
|η| < 0.7 and p_T > 45 GeV, the b-tagging efficiency for
b-jets and the misidentification rate of light jets are,
respectively, 74% and 8.5% for loose b-tags, and 48% and
0.6% for tight b-tags. Events with at least two loose
b-tags are classified as double-tagged (DT). Events not in
the DT sample that contain a single tight b-tag are
classified as single-tagged (ST). The dijet H → bb̄ candidate
is composed of the two highest p_T b-tagged jets in DT
events, and the b-tagged jet plus the highest p_T
non-b-tagged jet in ST events.
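
The ST/DT classification and the choice of the dijet candidate can be sketched as follows. The jet representation (a dict with 'pt', 'loose' and 'tight' keys) and the function name are hypothetical, and the sketch assumes that a jet passing the tight requirement also passes the loose one.

```python
def classify_event(jets):
    """Return (category, candidate_pair) for a list of jets.

    Each jet is a dict with keys 'pt' (GeV), 'loose' and 'tight' (bools).
    Assumption for this sketch: every tight-tagged jet is also loose-tagged.
    """
    by_pt = sorted(jets, key=lambda j: j["pt"], reverse=True)

    # DT: at least two loose b-tags; candidate = two highest-pT tagged jets.
    loose = [j for j in by_pt if j["loose"]]
    if len(loose) >= 2:
        return "DT", (loose[0], loose[1])

    # ST: not DT, with a single tight b-tag; candidate = the tagged jet
    # plus the highest-pT jet that carries no b-tag.
    tight = [j for j in by_pt if j["tight"]]
    untagged = [j for j in by_pt if not j["loose"] and not j["tight"]]
    if len(tight) == 1 and untagged:
        return "ST", (tight[0], untagged[0])

    return None, None
```

Checking the DT condition first reproduces the statement that ST events are those not already in the DT sample.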

The background from multijet events with jets
misidentified as leptons is estimated from control samples
in the data. For the μμ channel, the multijet control
sample contains events that fail the muon isolation
requirement but otherwise pass the event selection. In the
μμ_trk multijet control sample, the μ and μ_trk are required
to have the same charge. For the ee channel, the electrons
must fail isolation and shower shape requirements. The
resulting trigger bias is corrected by reweighting
distributions in lepton p_T and η to match an unbiased
control sample. Misidentified ICR electrons in the ee_ICR
channel are selected from a background region of the NN_ICR
output.

The dominant background process is the production of
a Z boson in association with jets, with the Z boson
decaying to dileptons (Z+jets). The light-flavor component
(Z+LF) includes jets from only light quarks (uds) or
gluons. The heavy-flavor component (Z+HF) includes
non-resonant Z+bb̄ production, which has the same final
state as the signal, and Z+cc̄. The remaining backgrounds
are from top quark pair (tt̄) and diboson production. We
simulate ZH → ℓ+ℓ−bb̄ and inclusive diboson production
with PYTHIA [18] and Z+jets and tt̄ → ℓ+νb ℓ−ν̄b̄
processes with ALPGEN [19], using the CTEQ6L1 [20]
leading-order parton distribution functions (PDFs). The
events generated with ALPGEN are input to PYTHIA for
parton showering and hadronization, and can contain
additional jets. For these events, we use a matching
procedure to avoid double counting partons produced by
ALPGEN and those subsequently added by the showering
in PYTHIA [19]. All samples are processed using a
detector simulation program based on GEANT3 [21], and
the same offline reconstruction algorithms used to process
the data. Events from randomly chosen beam crossings
are overlaid on the simulated events to reproduce
the effect of multiple pp̄ interactions and detector noise.

The cross section and branching ratio for the signal
are taken from Refs. [22, 23]. For the tt̄ and diboson
processes, the cross sections are taken from MCFM [24],
calculated at next-to-leading order (NLO). The inclusive
Z boson cross section is scaled to next-to-NLO [25], with
additional NLO heavy-flavor corrections calculated from
MCFM applied to Z+bb̄ and Z+cc̄.

Corrections are applied to the simulated events to
improve the modeling. The simulated ee_ICR and μμ_trk
events are weighted by trigger efficiencies measured in
data. For the ee channel, no correction is applied as
the combination of lepton and jet triggers is nearly 100%
efficient. Lepton identification efficiencies are corrected
as a function of η_det and φ of the lepton. Jet energies
are modified to reproduce the resolution observed in
data. Scale factors are applied to correct for differences
in jet reconstruction efficiency between data and
simulation. To model the b-tagged samples, simulated events
are weighted by their probability to satisfy the ST or DT
criteria as measured in data.

The performance of the background model is evaluated
in control samples with negligible signal contributions
that are obtained by applying only the lepton selection
requirements (inclusive) or all selection requirements
except b-tagging (pretag). The simulated Z boson events
are reweighted such that the p_T distribution of the Z
boson is consistent with the observed distribution [26].
To improve upon the ALPGEN modeling of Z+jets,
motivated by a comparison with the SHERPA generator [13],
the pseudorapidities of the two jets with the highest p_T,
and the separation between them, are reweighted to match
the distributions measured in the pretag data.

Normalization factors for the simulated and the
multijet samples are determined from a fit to the m_ℓℓ
distributions in the inclusive and pretag data. This
improves the accuracy of the background model and reduces
the impact of systematic uncertainties that affect pretag
event yields (e.g., uncertainties on luminosity). The
region 40 < m_ℓℓ < 60 GeV, where the multijet contribution
is most prominent, is included in the fit to normalize the
multijet control sample to the multijet contribution. The
inclusive control sample constrains the lepton trigger and
identification efficiencies, while the pretag control
sample, which includes jet requirements, constrains a
common scale factor k_Z+jets that corrects the Z+jets cross
section. The total event yields after applying all
corrections and normalization factors are shown in Table I.

A multivariate analysis combines the most significant
kinematic information into a single discriminant [28].
Each decision tree in a random forest (RF) [29] is trained
to separate signal from background using a randomly
selected subsample of simulated events. In addition, a
random subset of input variables is considered for each
decision in each tree. The RF output is a
performance-weighted average of the output from each
decision tree. To exploit the kinematics of the
ZH → ℓ+ℓ−bb̄ process, the energies of the candidate leptons
and jets are adjusted within their experimental resolutions
with a χ² fit that constrains m_ℓℓ to the mass and width of
the Z boson, and the p_T of the ℓ+ℓ−bb̄ system to the
expected distribution for ZH events before detector
resolution effects [14]. The variables selected for the RF
are: the transverse momenta of the two b-jet candidates and
the dijet invariant mass, before and after the jet energies
are adjusted by the kinematic fit; angular differences
within and between the dijet and dilepton systems; the angle
between the proton beam and the Z boson candidate in
the rest frame of the ℓ+ℓ−bb̄ system [30]; and composite
kinematic variables such as the p_T of the dijet system
and the scalar sum of the transverse momenta of the
leptons and jets. The RF outputs with all lepton channels
combined are shown separately for ST and DT events in
Figs. 1(a,b).
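
The two randomization steps and the final averaging of a random forest can be sketched generically. This is a minimal illustration under stated assumptions, not the implementation used in the analysis: the helper names, the subsample fraction, and the way per-tree weights are obtained are all hypothetical.

```python
import random

def random_subsample(events, fraction, rng):
    """Pick a random subsample of training events for one tree (no repeats)."""
    k = max(1, int(round(fraction * len(events))))
    return rng.sample(events, k)

def random_variable_subset(variables, n_considered, rng):
    """Pick the input variables considered for one decision in one tree."""
    return rng.sample(variables, n_considered)

def rf_output(tree_outputs, tree_weights):
    """Weighted average of the per-tree outputs for a single event.

    tree_weights stands in for a per-tree performance measure; how that
    measure is defined is not specified here and is left as an assumption.
    """
    if len(tree_outputs) != len(tree_weights):
        raise ValueError("need one weight per tree")
    total_weight = sum(tree_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(o * w for o, w in zip(tree_outputs, tree_weights))
    return weighted / total_weight
```

With tree outputs in [0, 1], the weighted average also lies in [0, 1], matching the range of the RF output shown in the figures.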

Systematic uncertainties resulting from the back- 
ground normalization are assessed for the multijet con- 
tribution (20-60% depending on channel) and for effects 
of lepton efficiency (2-10%), some of which are corre- 
lated between all lepton channels (6%). The normal- 
ization of the Z+jets sample to the pretag data con- 
strains the Z+jets cross section multiplied by any jet- 
dependent efficiency to within the statistical uncertainty 
of the pretag data (1-2%). Additional systematic
uncertainties (10-20%) for possible jet-dependent efficiency
effects absorbed into k_Z+jets are applied to the tt̄,
diboson and ZH samples. The normalization to the pretag
data, which is dominated by Z+LF, does not strongly
constrain the cross sections of other processes. A cross
section uncertainty of 20% for Z+HF and 6%-10% for
other backgrounds is determined from Ref. [24]. For the
signal, the uncertainty is 6% [H|. The normalization 
to the dilepton mass distributions reduces the impact of 
many of the remaining systematic uncertainties on the 
background size (except those related to b-tagging), but
changes to the shape of the RF output distribution
persist and are accounted for. Additional sources of
systematic uncertainty include: jet energy scale, jet energy
resolution, jet identification efficiency, b-tagging and
trigger efficiencies, PDFs, data-determined corrections to the
model for Z+jets, and modeling of the underlying event. 
The uncertainties from the factorization and renormal- 
ization scales in the simulation of Z+jets are estimated 
by scaling these parameters by factors of 0.5 and 2. 

No significant excess above the background expecta- 
tion is observed. Therefore, we set limits on the ZH pro- 
duction cross section with a modified frequentist (CLs) 
method that uses a negative log likelihood ratio (LLR) 
             Data     Total Background   Multijet   Z+LF     Z+HF    Other   ZH
inclusive    865254   853976             131905     701516   19074   1481    9AA
pretag       31336    30634              3449       23234    3459    491     6.82
ST           728      707 ± 130          48.4       161      443     54.1    1.87 ± 0.25
DT           485      435 ± 68           29.5       106      237     61.8    2.34 ± 0.36



TABLE I: Expected and observed event yields for all lepton channels combined after requiring two leptons
(inclusive), after also requiring two jets (pretag), and after requiring at least one tight (ST) or two loose (DT)
b-tags. The total statistical and systematic uncertainties are indicated for the "Total Background" and "ZH"
columns of the ST and DT samples. The "Other" column includes diboson and tt̄ event yields. The ZH sample
yields are for M_H = 115 GeV.
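
As a consistency check on the yields of Table I, the following sketch sums the Multijet, Z+LF, Z+HF and Other columns and compares the result with the quoted total background. The tolerance of one event is an assumption made here to allow for rounding of the published values; the dictionary layout is hypothetical.

```python
# Yields copied from Table I: total background and its four components
# (Multijet, Z+LF, Z+HF, Other), in that order.
TABLE_I = {
    "inclusive": {"total": 853976.0, "parts": [131905.0, 701516.0, 19074.0, 1481.0]},
    "pretag":    {"total": 30634.0,  "parts": [3449.0, 23234.0, 3459.0, 491.0]},
    "ST":        {"total": 707.0,    "parts": [48.4, 161.0, 443.0, 54.1]},
    "DT":        {"total": 435.0,    "parts": [29.5, 106.0, 237.0, 61.8]},
}

def component_sum(sample):
    """Sum of the background components for one sample."""
    return sum(TABLE_I[sample]["parts"])

def is_consistent(sample, tolerance=1.0):
    """True if the components add up to the quoted total within rounding."""
    return abs(component_sum(sample) - TABLE_I[sample]["total"]) <= tolerance

for name in TABLE_I:
    print(name, component_sum(name), is_consistent(name))
```

All four samples agree with the quoted totals within the assumed rounding tolerance.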




[Figure 1: three panels, (a), (b) and (c); horizontal axis in each panel: RF Output, from 0 to 1]



FIG. 1: Data and background RF outputs trained for a Higgs boson with M_H = 115 GeV in (a) ST and (b) DT
samples. (c) Background-subtracted combination of ST and DT samples, with the systematic uncertainty bands
before and after the fit performed by the limit-setting program.



of the signal-plus-background (S+B) hypothesis to the
background-only (B) hypothesis [31]. The RF output
distributions and corresponding systematic uncertainties
of the ST and DT samples from each leptonic channel
and from two distinct data taking periods are analyzed
separately by the limit setting program to take advantage
of the sensitivity in the more discriminating channels. To
minimize the impact of the systematic uncertainties, the
likelihoods of the B and S+B hypotheses are each
maximized by independent fits that vary nuisance parameters
used to model the systematic effects [32]. The
correlations among systematic uncertainties are maintained
across channels, as well as backgrounds and signal. The
background-subtracted RF distribution, combined for all
channels, with systematic uncertainty bands both before
and after the fitting procedure, is shown in Fig. 1(c).

Figure 2 shows the observed LLR as a function of Higgs
boson mass. Also shown are the expected (median) LLRs
for the B and S+B hypotheses, together with the one
and two standard deviation bands of the background-only
expectation. A signal-like excess would result in a
negative value of observed LLR. The data are consistent
with either hypothesis for the entire mass range 100 <
M_H < 150 GeV. The 95% CL upper limit on the cross
section times branching ratio, expressed as a ratio to the
SM prediction, for each M_H is presented in Table II.
At M_H = 115 GeV, the observed (expected) limit on
this ratio is 5.9 (7.1). Compared to the previous best
expected limit in this channel [11], this represents a 40%
improvement.
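
The sign convention of the LLR can be made concrete with a simplified sketch. It assumes independent Poisson-distributed bins with fixed signal and background expectations, so that LLR = −2 ln[L(S+B)/L(B)] = 2 Σ_i [s_i − n_i ln(1 + s_i/b_i)]. The nuisance-parameter fits described in the text are deliberately omitted, and the function name is hypothetical.

```python
import math

def llr(data, signal, background):
    """Negative log likelihood ratio, -2 ln[L(S+B) / L(B)], for Poisson bins.

    Simplifying assumption: no systematic uncertainties, so the
    likelihoods are not maximized over nuisance parameters.
    """
    total = 0.0
    for n, s, b in zip(data, signal, background):
        # The n! and b^n terms cancel in the ratio, leaving
        # -2 ln(ratio) = 2 * (s - n * ln(1 + s / b)) for each bin.
        total += 2.0 * (s - n * math.log(1.0 + s / b))
    return total
```

When the observed count equals the background expectation, the result is positive; when it equals the signal-plus-background expectation, the result is negative, in line with the statement that a signal-like excess gives a negative LLR.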

Supplementary material detailing the pretag control 
FIG. 2: Observed LLR as a function of Higgs boson
mass. Also shown are the expected LLRs for the B and
S+B hypotheses, together with the one and two
standard deviation (s.d.) bands of the background-only
expectation.



sample, the effect of the kinematic fit, and additional
cross section limits and LLR distributions from individual
lepton channels is available at [33].
TABLE II: The expected and observed 95% CL upper limits on the SM Higgs boson production cross section for
ZH → ℓ+ℓ−bb̄, expressed as a ratio to the SM cross section. The corresponding observed limits on the ZH
production cross section multiplied by the branching ratio of H → bb̄ are also reported (in fb).

FIG. 3: Pretag distributions for (a) the dilepton invariant mass, (b) the dijet invariant mass after the kinematic fit,
(c) the RF discriminant trained for ST events, and (d) the RF discriminant trained for DT events, for all lepton
channels combined; (e) and (f) reproduce (c) and (d) using a logarithmic scale.

FIG. 4: Dijet invariant mass distributions before the kinematic fit in (a) ST events and (b) DT events, and
calculated from jet energies as adjusted by the kinematic fit in (c) ST events and (d) DT events, combined for all
lepton channels. The ZH signal shown is for M_H = 115 GeV.

FIG. 5: The expected and observed 95% CL cross section limit divided by the SM Higgs boson production cross
section as a function of M_H (a) for M_H < 150 GeV and (b) for M_H < 130 GeV. Limits are for the combination of
the DT and ST samples in all lepton channels.

FIG. 6: Observed LLR as a function of M_H for the (a) ee, (b) μμ, (c) ee_ICR, and (d) μμ_trk channels. Also shown
are the expected LLRs for the B and S+B hypotheses, together with the one and two standard deviation (s.d.)
bands about the background-only expectation.




FIG. 7: Expected and observed 95% CL cross section limit divided by the SM cross section as a function of M_H for
the (a) ee, (b) μμ, (c) ee_ICR, and (d) μμ_trk channels.



